Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

Authors

  • Su Nam Kim
  • Preslav Nakov
Abstract

Responding to the need for semantic lexical resources in natural language processing applications, we examine methods to acquire noun compounds (NCs), e.g., orange juice, together with suitable fine-grained semantic interpretations, e.g., squeezed from, which are directly usable as paraphrases. We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations. In evaluation, we found that having one compound noun fixed yields both a higher number of semantically interpreted NCs and improved accuracy due to stronger semantic restrictions.
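The alternating extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `HITS` list stands in for web search results, and the names `HITS` and `bootstrap` are hypothetical. It assumes one noun (the head) is held fixed, and it alternates between collecting paraphrasing patterns for known modifiers and collecting new modifiers for known patterns until nothing new is found.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical toy data standing in for web hits: (head, pattern, modifier).
# ("juice", "squeezed from", "orange") reads as "juice squeezed from orange(s)".
HITS: List[Tuple[str, str, str]] = [
    ("juice", "squeezed from", "orange"),
    ("juice", "made from", "orange"),
    ("juice", "squeezed from", "lemon"),
    ("juice", "made from", "apple"),
    ("juice", "extracted from", "apple"),
    ("juice", "extracted from", "carrot"),
    ("bar", "that serves", "juice"),
]


def bootstrap(
    head: str,
    seed_modifiers: Iterable[str],
    hits: List[Tuple[str, str, str]],
    max_iters: int = 5,
) -> Tuple[Set[str], Set[str]]:
    """Alternately grow the modifier set and the pattern set for a fixed head."""
    modifiers: Set[str] = set(seed_modifiers)
    patterns: Set[str] = set()
    for _ in range(max_iters):
        # Step 1: patterns that link the fixed head to any known modifier.
        new_patterns = {p for h, p, m in hits if h == head and m in modifiers}
        # Step 2: modifiers that occur with the fixed head in any of those patterns.
        new_modifiers = {m for h, p, m in hits if h == head and p in new_patterns}
        # Stop once an iteration adds nothing new.
        if new_patterns <= patterns and new_modifiers <= modifiers:
            break
        patterns |= new_patterns
        modifiers |= new_modifiers
    return modifiers, patterns


modifiers, patterns = bootstrap("juice", ["orange"], HITS)
print(sorted(modifiers))
print(sorted(patterns))
```

Starting from the single seed compound orange juice, the sketch reaches lemon, apple, and carrot as further modifiers of juice. A realistic system would also need to score or filter candidate patterns, since unfiltered bootstrapping tends to drift semantically; that step is omitted here.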


Related resources

Interpreting noun compounds using paraphrases (Interpretación de los compuestos nominales mediante paráfrasis)

Noun compounds are abundant in English and their interpretation is crucial for many natural language processing tasks. We propose a method for automatic two-noun noun compound interpretation that searches for suitable paraphrases in static corpora and then issues Web search engine queries to validate them. Native speakers were recruited to evaluate the returned paraphrases for noun compounds: t...


Interpreting Noun Compounds using Bootstrapping and Sense Collocation

This paper describes a bootstrapping method for automatically tagging noun compounds with their corresponding semantic relations. Our work takes advantage of the collocation of senses of the noun compound constituents and also word similarity. We exploit this to generate a set of noun compounds from a set of previously tagged noun compounds by replacing one constituent of each noun compound wit...


Web-Scale Features for Full-Scale Parsing

Counts from large corpora (like the web) can be powerful syntactic cues. Past work has used web counts to help resolve isolated ambiguities, such as binary noun-verb PP attachments and noun compound bracketings. In this work, we first present a method for generating web count features that address the full range of syntactic attachments. These features encode both surface evidence of lexical af...


Linked Open Data and Web Corpus Data for noun compound bracketing

This research provides a comparison of a linked open data resource (DBpedia) and web corpus data resources (Google Web Ngrams and Google Books Ngrams) for noun compound bracketing. Large corpus statistical analysis has often been used for noun compound bracketing, and our goal is to introduce a linked open data (LOD) resource for such task. We show its particularities and its performance on the...


Standardised Evaluation of English Noun Compound Interpretation

We present a tagged corpus for English noun compound interpretation and describe the method used to generate them. In order to collect noun compounds, we extracted binary noun compounds (i.e. noun-noun pairs) by looking for sequences of two nouns in the POS tag data of the Wall Street Journal. We then manually filtered out all noun compounds which were incorrectly tagged or included proper noun...




Publication date: 2011